Attention Is (not) All You Need for Commonsense Reasoning
The recently introduced BERT model exhibits strong performance on several
language understanding benchmarks. In this paper, we describe a simple
re-implementation of BERT for commonsense reasoning. We show that the
attentions produced by BERT can be directly utilized for tasks such as the
Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed
attention-guided commonsense reasoning method is conceptually simple yet
empirically powerful. Experimental analysis on multiple datasets demonstrates
that our proposed system performs remarkably well in all cases while
outperforming the previously reported state of the art by a margin. Although
the results suggest that BERT implicitly learns to establish complex
relationships between entities, solving commonsense reasoning tasks might
require more than unsupervised models learned from huge text corpora.
Comment: to appear at ACL 201
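
The abstract does not spell out how the attentions are turned into a decision, so the following is only a minimal sketch of one plausible reading, not the authors' scoring rule. It assumes the attention weights are already available as a nested list indexed as [layer][head][query token][key token], and that each candidate antecedent covers a known token span. For every head it takes the strongest attention from the pronoun to any token of a candidate, sums those values over all heads and layers, and picks the candidate with the largest total. The names `candidate_scores`, `resolve_pronoun`, `toy_attention`, and `toy_candidates` are hypothetical, and the toy numbers are invented for illustration.

```python
from typing import Dict, List, Tuple

# Assumed layout: attention[layer][head][query_token][key_token]
Attention = List[List[List[List[float]]]]


def candidate_scores(
    attention: Attention,
    pronoun_idx: int,
    candidates: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Sum, over all layers and heads, the largest attention weight
    going from the pronoun to any token of each candidate span.

    Spans are half-open token ranges (start, end).
    """
    scores: Dict[str, float] = {}
    for name, (start, end) in candidates.items():
        total = 0.0
        for layer in attention:
            for head in layer:
                pronoun_row = head[pronoun_idx]
                total += max(pronoun_row[start:end])
        scores[name] = total
    return scores


def resolve_pronoun(
    attention: Attention,
    pronoun_idx: int,
    candidates: Dict[str, Tuple[int, int]],
) -> str:
    """Return the candidate with the highest aggregated attention."""
    scores = candidate_scores(attention, pronoun_idx, candidates)
    return max(scores, key=scores.get)


# Toy input: 1 layer, 2 heads, 4 tokens ["trophy", "suitcase", "it", "big"].
uniform = [0.25, 0.25, 0.25, 0.25]
toy_attention: Attention = [
    [
        [uniform, uniform, [0.6, 0.2, 0.1, 0.1], uniform],  # head 0
        [uniform, uniform, [0.3, 0.5, 0.1, 0.1], uniform],  # head 1
    ]
]
toy_candidates = {"trophy": (0, 1), "suitcase": (1, 2)}

if __name__ == "__main__":
    print(candidate_scores(toy_attention, 2, toy_candidates))
    print(resolve_pronoun(toy_attention, 2, toy_candidates))
```

With a real model, the attention weights would come from the model's forward pass rather than hand-written lists; how they are extracted and normalized is outside what the abstract states.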